

## **ECE3700J Introduction to Computer Organization**

## Homework 6

Assigned: October November 15, 2022

Due: 2:00pm on November 22, 2022

**Submit a PDF file on Canvas** 

1. (10 points) The following code is written in C, where elements within the same row are stored contiguously. Assume each word is a 32-bit integer.

(1) Which variable references exhibit temporal locality? (5 points)

Answer: I, J, B[I][0]

(2) Which variable references exhibit spatial locality? (5 points)

Answer: A[I][J]

- 2. (40 points) Below is a list of 32-bit memory address references, given as word addresses: 0x03, 0xB4, 0x2B, 0x02, 0xBF, 0x58, 0xBE, 0x0E, 0xB5, 0x2C, 0xBA, 0xFD
  - (1) For each of these references, identify the tag and the cache index given a direct-mapped cache with 8 one-word blocks. Also list if each reference is a hit or a miss, assuming the cache is initially empty. (10 points)

| Word Address | Index | Tag (27 bits) | Hit or Miss |
|--------------|-------|---------------|-------------|
| 0x0000_0011  | 011   | 00000         | M           |
| 0x1011_0100  | 100   | 10110         | M           |
| 0x0010_1011  | 011   | 00101         | M           |
| 0x0000_0010  | 010   | 00000         | M           |
| 0x1011_1111  | 111   | 10111         | M           |
| 0x0101_1000  | 000   | 01011         | M           |
| 0x1011_1110  | 110   | 10111         | M           |
| 0x0000_1110  | 110   | 00001         | M           |
| 0x1011_0101  | 101   | 10110         | M           |
| 0x0010_1100  | 100   | 00101         | M           |
| 0x1011_1010  | 010   | 10111         | M           |
| 0x1111_1101  | 101   | 11111         | M           |

| Index | Valid | Tag   | Data |
|-------|-------|-------|------|
| 000   | 1     | 01011 |      |
| 001   | 0     |       |      |
| 010   | 1     | 10111 |      |
| 011   | 1     | 00101 |      |
| 100   | 1     | 00101 |      |
| 101   | 1     | 11111 |      |
| 110   | 1     | 00001 |      |
| 111   | 1     | 10111 |      |

(2) For each of these references, identify the tag and the cache index given a direct-mapped cache with two-word blocks and a total size of 4 blocks. Also list if each reference is a hit or a miss, assuming the cache is initially empty. (10 pints)

| Word Address | Index | Tag (27 bits) | Hit or Miss |
|--------------|-------|---------------|-------------|
| 0x0000_0011  | 01    | 00000         | M           |
| 0x1011_0100  | 10    | 10110         | M           |
| 0x0010_1011  | 01    | 00101         | M           |
| 0x0000_0010  | 01    | 00000         | M           |
| 0x1011_1111  | 11    | 10111         | M           |
| 0x0101_1000  | 00    | 01011         | M           |
| 0x1011_1110  | 11    | 10111         | Н           |
| 0x0000_1110  | 11    | 00001         | M           |
| 0x1011_0101  | 10    | 10110         | Н           |
| 0x0010_1100  | 10    | 00101         | M           |
| 0x1011_1010  | 01    | 10111         | M           |
| 0x1111_1101  | 10    | 11111         | M           |

| Index | Valid | Tag   | Word 0 | Word 1 |
|-------|-------|-------|--------|--------|
| 00    | 1     | 01011 | 0x58   | 0x59   |
| 01    | 1     | 10111 | 0x02   | 0x03   |
| 10    | 1     | 11111 | 0xFC   | 0xFD   |
| 11    | 1     | 00001 | 0x0E   | 0x0F   |

(3) You are asked to optimize a cache design for the given references. There are three direct-mapped cache designs possible, all with a total of 8 words of data: C1 has 1-word blocks, C2 has 2-word blocks, and C3 has 4-word blocks. In terms of miss rate, which cache design is the best? If the miss stall time is 35 cycles, and C1 has an access time of 2 cycles, C2 takes 3 cycles, and C3 takes 5 cycles, which is the best cache design? (20 points)

| Word Address | Index | Tag (27 bits) | Hit or Miss |
|--------------|-------|---------------|-------------|
| 0x0000_0011  | 0     | 00000         | M           |
| 0x1011_0100  | 1     | 10110         | M           |
| 0x0010_1011  | 0     | 00101         | M           |
| 0x0000_0010  | 0     | 00000         | M           |
| 0x1011_1111  | 1     | 10111         | M           |
| 0x0101_1000  | 0     | 01011         | M           |
| 0x1011_1110  | 1     | 10111         | Н           |
| 0x0000_1110  | 1     | 00001         | M           |
| 0x1011_0101  | 1     | 10110         | M           |
| 0x0010_1100  | 1     | 00101         | M           |
| 0x1011_1010  | 0     | 10111         | M           |
| 0x1111_1101  | 1     | 11111         | M           |

| Index | Valid | Tag   | Word 0 | Word 1 | Word 2 | Word 3 |
|-------|-------|-------|--------|--------|--------|--------|
| 0     | 1     | 01011 | 0xB8   | 0xB9   | 0xBA   | 0xBB   |
| 1     | 1     | 11111 | 0xFC   | 0xFD   | 0xFE   | 0xFF   |

In terms of miss rate, C1's miss rate is 100%, C2's miss rate is 83.3%, C3's miss rate is 91.7%. C2 is the best.

## In terms of total time:

For C1: Total time = 12 \* 2 cycles + 12 \* 35 cycles = 444 cycles For C2: Total time = 12 \* 3 cycles + 10 \* 35 cycles = 386 cycles For C3: Total time = 12 \* 5 cycles + 11 \* 35 cycles = 445 cycles

C2 is the best design.

3. (50 points) For a direct-mapped cache design with a 32-bit byte address, the following bits of the address are used to access the cache.

| Tag     | Index | Offset |
|---------|-------|--------|
| 31 - 10 | 9 - 5 | 4 - 0  |

(1) What is the cache block size (in words)? (5 points)

Answer: 8 words

(2) How many blocks does the cache have? (5 points)

Answer: 32 blocks

(3) What is the ratio between total bits required for such a cache implementation over the data storage bits? (5 points)

Answer: Assume only valid bit is implemented.

Each block has 1 valid and 21 bits of tag, total number of bits = 32 \* (21 + 1) = 704 bits

Each block has 8 words of data, total bits = 32 \* 8 \* 32 = 8192 bits

Ratio = 704 / 8192 = 8.59%

Beginning from power on, the following byte addresses for cache references are recorded.

| Address |      |      |      |      |      |       |      |      |       |      |       |
|---------|------|------|------|------|------|-------|------|------|-------|------|-------|
| 0x00    | 0x04 | 0x10 | 0x84 | 0xE8 | 0xA0 | 0x400 | 0x1E | 0x8C | 0xC1C | 0xB4 | 0x884 |

- (4) (20 points) For each reference, list
  - a) its tag, index, and offset
  - b) whether it is a hit or a miss, and
  - c) How many blocks were replaced (if any)?

| Byte Address     | Tag (21 bits) | Index | Offset | Hit or Miss | Blocks replaced |
|------------------|---------------|-------|--------|-------------|-----------------|
| 0x0000_0000      | 0             | 00000 | 00000  | M           | 0               |
| 0x0000_0100      | 0             | 00000 | 00100  | H           | 0               |
| 0x0001_0000      | 0             | 00000 | 10000  | H           | 0               |
| 0x1000_0100      | 0             | 00100 | 00100  | M           | 0               |
| 0x1110_1000      | 0             | 00111 | 01000  | M           | 0               |
| 0x0100_0000_0000 | 001           | 00000 | 00000  | M           | 1               |
| 0x0001_1110      | 0             | 00000 | 11110  | M           | 1               |
| 0x1000_1100      | 0             | 00100 | 01100  | H           | 0               |
| 0x1100_0001_1100 | 0011          | 00000 | 11100  | M           | 1               |
| 0x1011_0100      | 0             | 00101 | 10100  | M           | 0               |
| 0x1000_1000_0100 | 0010          | 00100 | 00100  | M           | 1               |

(5) What is the hit ratio? (5 points)

Answer: hit ratio = 3/11 = 27.3%



(6) Show the final state of the cache, with each valid line represented as <index, tag, data>. (10 points)

| Index | Valid | Tag  | W0    | W1    | W2    | W3    | W4    | W5    | W6    | W7    |
|-------|-------|------|-------|-------|-------|-------|-------|-------|-------|-------|
| 00000 | 1     | 0011 | 0xC00 | 0xC04 | 0xC08 | 0xC0C | 0xC10 | 0xC14 | 0xC18 | 0xC1C |
| 00001 | 0     |      |       |       |       |       |       |       |       |       |
| 00010 | 0     |      |       |       |       |       |       |       |       |       |
| 00011 | 0     |      |       |       |       |       |       |       |       |       |
| 00100 | 1     | 0010 | 0x880 | 0x884 | 0x888 | 0x88C | 0x890 | 0x894 | 0x898 | 0x89C |
| 00101 | 1     | 0    | 0xA0  | 0xA4  | 0xA8  | 0xAC  | 0xB0  | 0xB4  | 0xB8  | 0xBC  |
| 00110 | 0     |      |       |       |       |       |       |       |       |       |
| 00111 | 1     | 0    | 0xE0  | 0xE4  | 0xE8  | 0xEC  | 0xF0  | 0xF4  | 0xF8  | 0xFC  |
| 01000 | 0     |      |       |       |       |       |       |       |       |       |
| 01001 | 0     |      |       |       |       |       |       |       |       |       |
| 01010 | 0     |      |       |       |       |       |       |       |       |       |
| 01011 | 0     |      |       |       |       |       |       |       |       |       |
| 01100 | 0     |      |       |       |       |       |       |       |       |       |
| 01101 | 0     |      |       |       |       |       |       |       |       |       |
| 01110 | 0     |      |       |       |       |       |       |       |       |       |
| 01111 | 0     |      |       |       |       |       |       |       |       |       |
| 10000 | 0     |      |       |       |       |       |       |       |       |       |
| 10001 | 0     |      |       |       |       |       |       |       |       |       |
| 10010 | 0     |      |       |       |       |       |       |       |       |       |
| 10011 | 0     |      |       |       |       |       |       |       |       |       |
| 10100 | 0     |      |       |       |       |       |       |       |       |       |
| 10101 | 0     |      |       |       |       |       |       |       |       |       |
| 10110 | 0     |      |       |       |       |       |       |       |       |       |
| 10111 | 0     |      |       |       |       |       |       |       |       |       |
| 11000 | 0     |      |       |       |       |       |       |       |       |       |
| 11001 | 0     |      |       |       |       |       |       |       |       |       |
| 11010 | 0     |      |       |       |       |       |       |       |       |       |
| 11011 | 0     |      |       |       |       |       |       |       |       |       |
| 11100 | 0     |      |       |       |       |       |       |       |       |       |
| 11101 | 0     |      |       |       |       |       |       |       |       |       |
| 11110 | 0     |      |       |       |       |       |       |       |       |       |
| 11111 | 0     |      |       |       |       |       |       |       |       |       |